ABSTRACT

Pattern  recognition  methods  were  used  to  evaluate  the 
information  content  of  mass  spectrometry  data  obtained  using 
transition  metal  ions  as  an  ionization  source.  Data  sets 
consisting  of  the  chemical  ionization  mass  spectra  for  Fe+  and 
Y+  with  72  organics  (representing  the  six  classes  alkane, 
alkene,  ketone,  aldehyde,  ether,  and  alcohol)  and  24  alkanes 
(representing  the  three  subclasses  linear,  branched,  and  cyclic) 
were  subjected  to  pattern  recognition  analysis  using  a  k-nearest 
neighbor approach with feature weightings. The reactivities of
Fe+  and  Y+  toward  the  classes  of  compounds  studied  were 
characterized  using  classification  accuracies  as  a  measure  of 
selectivity,  and  important  chemical  information  was  extracted 
from  the  raw  data  by  empirical  feature  selection  methods.  A 
total  recognition  accuracy  of  81%  was  obtained  for  the 
recognition  of  the  six  organic  classes  and  96%  accuracy  was 
obtained  for  the  recognition  of  the  three  subclasses  of  alkanes. 
INTRODUCTION


Electron impact (EI) ionization mass spectrometry has become
a standard means for the classification of unknown compounds
according to functionality or structure (1-5). The
differentiation of isomeric molecules, however, remains a
difficult problem, and subtle differences in molecular structure
often cannot be distinguished by electron impact. The need
arises, therefore, for a more selective form of ionization.
Chemical ionization (CI) has the potential for such increased
selectivity since it is possible to adjust the reactivity of the
CI reagent for selectivity in a way that is not possible for the
EI mass spectrometry experiment (6-8).

In our laboratory the reactivities of laser-generated
transition metal ions toward various types of compounds have been
studied for several years (9-13). A major goal of this work has
been  to  evaluate  the  utility  of  metal  ions  as  selective  reagents 
for  mass  spectral  identification  of  the  functionality  and 
structure  of  unknown  compounds.  In  view  of  the  potentially  large 
data  matrix  generated  from  the  reactions  of  different  metal  ions 
with  various  organic  compounds,  the  application  of  pattern 
recognition  techniques  provides  a  particularly  useful  means  for 
achieving  these  goals. 

Pattern recognition has been applied to a wide variety of
chemical problems and numerous reviews on the subject have been
published (14-21). Some of its more recent applications include
recognition of organic compounds using Fourier transform infrared
spectroscopy, interpretation of gas chromatography data, nuclear
magnetic resonance spectral interpretation, and analysis of
electrochemical systems (22-32). Many applications of pattern
recognition to mass spectral data have recently appeared in the
literature. Electron impact ionization is by far the most widely
used ionization means and has been employed in studies ranging
from the analysis of complex mixtures using gas
chromatography/mass spectrometry, to the recognition of steroids,
and the use of mass spectrometric data to predict the biological
activity of antibiotics (33-36). Pattern recognition has also
been applied to the experimental optimization of field-desorption
and fast atom-bombardment mass spectrometry and for the location
of homoconjugated triene and tetraene units in aliphatic
compounds using NO chemical ionization (37,38).

Because pattern recognition is a well established tool for
interpretation of mass spectral data, it was the goal of the work
described here to use this tool to enhance our understanding of
the information content of a new and important advance in
chemical ionization mass spectrometry. This was a particularly
efficient approach because of the potentially enormous data
matrices. It also provided a unique opportunity to apply pattern
recognition to an emerging data base where the scientist is near
the bottom of the "learning curve." This work illustrates how
such an application can enhance the rate of climbing that
"learning curve."

One of the goals of pattern recognition is to minimize the
number of features required to effect class separation within a
data set while maximizing the recognition accuracy through the
elimination of features detrimental to class separation. Thus,
an empirical feature selection algorithm is often used to map a
classification problem down from the space of all features to a
space of smaller dimensionality which consists of only important,
relevant features. Not only does this procedure enhance the
ratio of patterns to features in order to have a statistically
valid separation of classes (15), but it also can provide new
chemical insight based on the feature set selected to achieve a
given informational goal.

In this study the use of two metal ions, Fe+ and Y+, as
chemical ionization reagents was evaluated. Two different
empirical feature selection algorithms were used to extract
important features from the data: successive subtraction of
features with total recognition accuracy (SSTRA) as the selection
criterion, which employs constant weighting of the features, and
forward addition of features using the nearest neighbor distance
error (FANNDE) as the selection criterion, which performs weighting
optimization. A complete description of these algorithms has
been given (39). Using the recognition accuracies obtained by
these algorithms with the metal ion data, the selectivities of
the reagents for six organic functionalities as well as
selectivities for linear, branched and cyclic alkanes were
evaluated. The analytical utility of Fe+ and Y+, alone and
used in combination, for the recognition of functionality and
structure was also explored. Trends in reactivity have been
inferred from the misclassified compounds.
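
The idea of removing features while monitoring total recognition accuracy can be illustrated with a generic backward-elimination loop. The sketch below is not the SSTRA algorithm of reference (39), whose details are not given here; it assumes unweighted features, a leave-one-out 1-nearest-neighbor accuracy as the criterion, and made-up toy data. The names loo_accuracy and subtract_features are hypothetical.

```python
import math


def loo_accuracy(patterns, labels, features):
    """Leave-one-out 1-NN accuracy using only the listed feature indices."""
    correct = 0
    for i, x in enumerate(patterns):
        best_d, best_label = math.inf, None
        for j, y in enumerate(patterns):
            if i == j:
                continue  # the pattern being tested is left out
            d = sum((x[f] - y[f]) ** 2 for f in features)
            if d < best_d:
                best_d, best_label = d, labels[j]
        correct += best_label == labels[i]
    return correct / len(patterns)


def subtract_features(patterns, labels):
    """Remove one feature per pass for as long as accuracy does not drop."""
    features = list(range(len(patterns[0])))
    best = loo_accuracy(patterns, labels, features)
    while len(features) > 1:
        # Score every possible single-feature removal and keep the best one.
        trials = [
            (loo_accuracy(patterns, labels, [g for g in features if g != f]), f)
            for f in features
        ]
        acc, drop = max(trials)
        if acc < best:
            break
        features.remove(drop)
        best = acc
    return features, best


# Made-up data: feature 0 separates the classes, feature 1 is misleading.
toy_patterns = [(0.0, 5.0), (0.1, 1.0), (1.0, 4.9), (0.9, 1.1)]
toy_labels = ["a", "a", "b", "b"]
kept, accuracy = subtract_features(toy_patterns, toy_labels)
print(kept, accuracy)
```

In this toy case the misleading feature is discarded and the remaining feature alone separates the two classes, which mirrors the stated goal of fewer features with higher recognition accuracy.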
EXPERIMENTAL


A home-built capacitance bridge ion cyclotron resonance
(ICR) mass spectrometer, under the control of an IBM 9000
laboratory computer (40,41), and a Nicolet prototype Fourier
transform mass spectrometer (FTMS 1000) were used to generate the
CI mass spectral data (42). Details of the ICR and FTMS
experiments have been described elsewhere (3S). For most of these
experiments the magnetic field strength has been held constant at
0.9 Tesla, and the sample pressure has been maintained at
approximately 2 × 10^-7 torr, while the trapping times ranged from
100 to 500 milliseconds. The chemicals used for the collection
of the data in these experiments were obtained commercially in
high purity and were used as supplied except for application of
multiple freeze-pump-thaw cycles to remove non-condensable gases.

The transition metal ions for chemical ionization were
generated by focusing the fundamental or quadrupled beam of a
Quanta Ray Nd:YAG laser (1.06 µm) onto a metal foil or rod located
on one of the ICR cell plates. The details of this method for
generating metal ions have been described elsewhere (11,13), as
well as some of the problems associated with the generation of
non-thermal ions (40,43). Laser power and beam diameter have
been adjusted to predominantly form monopositive ions. A
background pressure of nitrogen at approximately 5 × 10^-6 torr
has been found to stabilize the ICR signal for the capacitance
bridge instrument during acquisition of slow scan data and has
been used for most of the experiments using this instrument (40).


The metal ion CI mass spectra for each of the compounds
listed in Table I have been obtained as described above for both
Fe+ and Y+. The data used for the recognition of the organic
compounds consist of the branching ratios for the primary
products generated by the initial reaction of the metal ion of
interest with the organic neutral sample. Subsequent reactions
of these product ions with the neutral (i.e., secondary, tertiary,
etc. reactions) were not considered. The reaction time and
pressure were adjusted such that predominantly primary products
were observed. The precursors of any product peaks in question
have been confirmed by double resonance techniques, and data have
been collected under a variety of different conditions (trapping
times and pressures) to confirm the primary product intensity
ratios. Much of the data collection has been repeated using both
the conventional ICR and FTMS to test the reproducibility of the
data. Under these conditions the relative intensities of the
products do not vary widely and have been found to be
reproducible to better than 10%.
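
As an illustration of how primary product intensities become pattern recognition features, the sketch below normalizes a set of peak intensities to branching ratios and places them in a fixed feature order. The ion labels, intensities, and function names are hypothetical and are not taken from the measured data.

```python
def branching_ratios(intensities):
    """Normalize primary-product peak intensities so that they sum to 1."""
    total = sum(intensities.values())
    if total <= 0:
        raise ValueError("no primary product signal")
    return {ion: value / total for ion, value in intensities.items()}


def feature_vector(ratios, feature_order):
    """Arrange ratios in a fixed order; products not observed are set to 0."""
    return [ratios.get(ion, 0.0) for ion in feature_order]


# Made-up intensities for two primary products of one compound.
raw = {"product_A": 152.0, "product_B": 48.0}
ratios = branching_ratios(raw)
vector = feature_vector(ratios, ["product_A", "product_B", "product_C"])
print(vector)
```

Using a common feature order for every compound is what allows spectra with different product ions to be compared in the same N-dimensional space.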

Six different training sets were generated from the data
collected. The first two contain the data for the reactions of
72 compounds representing the six organic classes (alkane,
alkene, ketone, aldehyde, ether and alcohol) with Fe+ and Y+,
respectively. The third contains the combination of the data in
the first two. The fourth and fifth training sets contain the
data for the reactions of 24 alkanes of the three subclasses
(linear, branched and cyclic) with Fe+ and Y+, respectively.
The sixth data set contains the combination of the data in
the fourth and fifth.

COMPUTER SOFTWARE

All of the pattern recognition programs have been written
for the IBM 9000 lab computer using IBM version CS 9000 FORTRAN
77 and employ the k-Nearest Neighbor algorithm (KNN) (44).

A classifier using this algorithm predicts the class of an
unknown to be the same as that of the majority of its k nearest
neighbors. Due to the relatively small size of the data sets,
only the first nearest neighbor (k = 1) was used to effect
classification, and the distance measure employed was the
Euclidean distance in an N-dimensional feature space. A scaling
factor or weighting of the features has been found to improve
clustering of the classes and a scheme has been developed to
weight the features differently for each class (39). Using this
weighting scheme, it is possible to assess the importance of a
given feature for each class since the uniqueness of a feature
can be inversely related to its weighting factor. The
leave-one-out (LOO) algorithm (45,46) has been used to
generate the recognition accuracies used by the pattern recognition
algorithms.
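
A minimal sketch of a 1-nearest-neighbor classifier with class-dependent feature weights and a leave-one-out accuracy estimate is given below. The original programs were written in FORTRAN 77 and the weighting scheme of reference (39) is not reproduced in this text, so the sketch makes an assumption: the distance from an unknown to a training pattern is computed with the weights of that training pattern's class. All names and data are hypothetical.

```python
import math


def weighted_distance(x, y, weights):
    """Euclidean distance with one multiplicative weight per feature."""
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y)))


def classify_1nn(unknown, patterns, labels, class_weights):
    """Return the class of the single nearest training pattern (k = 1)."""
    best_d, best_label = math.inf, None
    for pattern, label in zip(patterns, labels):
        # Assumption: use the weights belonging to the candidate's class.
        d = weighted_distance(unknown, pattern, class_weights[label])
        if d < best_d:
            best_d, best_label = d, label
    return best_label


def leave_one_out(patterns, labels, class_weights):
    """Classify each pattern against all the others; return the hit rate."""
    hits = 0
    for i in range(len(patterns)):
        rest_patterns = patterns[:i] + patterns[i + 1:]
        rest_labels = labels[:i] + labels[i + 1:]
        predicted = classify_1nn(patterns[i], rest_patterns, rest_labels,
                                 class_weights)
        hits += predicted == labels[i]
    return hits / len(patterns)


# Made-up two-feature branching-ratio patterns for two classes.
knn_patterns = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
knn_labels = ["alcohol", "alcohol", "alkane", "alkane"]
knn_weights = {"alcohol": [1.0, 1.0], "alkane": [1.0, 1.0]}
loo = leave_one_out(knn_patterns, knn_labels, knn_weights)
print(loo)
```

Because each pattern is withheld before it is classified, the leave-one-out figure estimates how well an unknown that is absent from the training set would be recognized.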


RESULTS  AND  DISCUSSION 

Functional Group Recognition

The optimal feature selection searches for the two
algorithms with the three organic data sets are compared
in Table II. From an examination of the individual class recognition
accuracies for Fe+ with the organics, it appears that iron ion
can distinguish alcohols most readily from the other compounds.
In the reaction of iron with alcohols, FeOH2+ is often observed
and is unique to this class. Alkanes are separated from the
other classes with 75% accuracy, and a very low recognition
accuracy for the ketone class indicates that iron cannot
distinguish this class from the others.

The nearest neighbors of the compounds that have been
misclassified in the best of the two trials reported in Table II
are identified in Table IV. In accordance with the high
recognition accuracies, the fewest misclassifications of the iron
data set occur for the alkanes and the alcohols. The alkenes are
misclassified as alkanes and ketones, two of the misclassified
alkenes being closest to cyclic ketones and two closest to
branched alkanes. In the reaction of ketones with Fe+, the
oxygen is most often lost as neutral CO, often accompanied by
[...] and other hydrocarbons. Thus the corresponding ions
observed are similar to those observed with alkenes. Two of the
ketones are closest to alkenes while the others are misclassified
as aldehyde, alcohol, alkane, and ether. Aldehydes are also
poorly separated from the other classes, having
misclassifications as alcohols, ketones, alkene, and alkane.
Only three misclassifications occur for ethers, two as alcohols
and one as ketone.

In contrast to iron, yttrium ion is particularly well suited
for differentiating hydrocarbons from oxygen-containing species.
Due to a very strong yttrium-oxygen bond, the major
primary product for virtually all oxygen-containing
functionalities studied has been YO+. With this single unique
feature it is possible to linearly separate oxygen-containing
from non-oxygen-containing species.
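
Separation on a single feature amounts to a threshold test on one coordinate of the feature space. The sketch below assumes a hypothetical threshold of zero on the YO+ branching ratio and uses made-up sample values; it is an illustration of the idea, not a rule derived from the measured data.

```python
def contains_oxygen(yo_ratio, threshold=0.0):
    """Hypothetical rule: YO+ signal above the threshold flags an oxygenate."""
    return yo_ratio > threshold


# Made-up YO+ branching ratios for two illustrative samples.
yo_samples = {"hydrocarbon_example": 0.0, "oxygenate_example": 0.6}
yo_flags = {name: contains_oxygen(ratio) for name, ratio in yo_samples.items()}
print(yo_flags)
```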

As with iron, high recognition accuracy (100% here) for
the alcohols is observed. Three unique peaks are observed for
Y+ with alcohols, YO+, YOH+, and YOH2+, allowing easy
distinction from the other organics. Alkanes are also recognized
with high accuracy using yttrium, and in contrast to iron, the
ketones are recognized with high accuracy from the yttrium data.
The alkenes are the most poorly recognized class at 50% accuracy.

The contrast in the reactivity of Y+ versus Fe+ becomes
apparent from the data presented in Table IV. The two cyclic
alkanes misclassified with yttrium are closest to cyclic alkenes.
Y+ shows a preference for dehydrogenation as opposed to C-C
bond cleavage and thus dehydrogenation of the cyclic alkanes
produces products similar to those found with alkenes.

With yttrium, as opposed to iron, a large improvement in
the separation of the oxygen-containing organics from the
hydrocarbons is observed. No misclassification of the alkenes as
oxygenated species is observed for Y+, in contrast to
iron; the misclassified alkenes are all closest to alkanes,
and similarly, cyclopentene and cyclohexene are closest to
cyclopentane and cyclohexane, respectively. For ketones with
yttrium, only methylcyclopropyl ketone is misclassified and
appears closest to hexanal. Three of the aldehydes are
misclassified and are closest to butanone, for which the only
observed peak is YO+. Since the major peak for the ketones and
aldehydes is YO+, it is not surprising that there is some
difficulty in distinguishing these two classes. Three of the
misclassified ethers appear closest to butanone, on the basis of
YO+ intensity, and propylene oxide is closest to cyclohexanol.

When the data for the CI mass spectra using the two metals
are combined, an improvement in the individual class accuracies
is observed as well as an improvement in the total recognition
accuracies for the data sets of the six organic classes. Alkanes
and alcohols are still recognized with high accuracy while a
large improvement in the recognition of alkenes is noted.
Alkenes are recognized with a maximum of 83% accuracy (10 of the
12 compounds) as compared to a maximum of 67% accuracy with iron
alone and 50% with yttrium alone. Using the data for iron alone,
it had been difficult to distinguish the alkenes from the alkanes
and oxygen-containing classes; using yttrium alone it had been
difficult to separate the alkanes from the alkenes. By adding
the yttrium features to those of iron, oxygen-containing organics
are now distinguishable from alkenes by the presence of YO+.
Thus additional and complementary information for the alkenes is
obtained by combining the features of the two metals.

Recognition of the ketones has improved from 33% accuracy using
iron features to 67% accuracy at worst, and 83% at best, when the
yttrium data are included. Using yttrium features alone, however,
a 92% recognition accuracy for ketones is possible. Thus some
iron features which are detrimental to the classification of the
ketones may have been included, indicating that in feature
selection for a multicategory classification problem there is
often a trade-off between individual class recognition
accuracies. A similar trend is noted for the recognition of
aldehydes. No substantial improvement in the recognition of the
ethers is observed by combining the data for both metals.

Structural Identification

Table III lists the recognition accuracies found for the
three data sets of the alkane subclasses by the selection
algorithms. A high total recognition accuracy is obtained with
either metal ion, indicating that either can distinguish the
three subclasses with ease. The nearest neighbors of the
misclassifications for the best trial of SSTRA or FANNDE
performed on these data sets are shown in Table V. Few
misclassifications occur since a good separation of the three
subclasses is observed for both metals. By combining the data
for the two metals reacted with the 24 alkanes, a total
recognition accuracy of 96% is possible, as opposed to a maximum
of 92% for iron alone and 92% for yttrium alone. Only
cyclobutane is misclassified, as its nearest neighbor is propane.
The reaction of Fe+ with cyclobutane yields 95% FeC2H4+
while propane reacts with iron to form 76% of this ion. In
examining the feature weightings for the three subclasses, this
feature is most important for and unique to the linear subclass.
The addition of the yttrium features to those of iron is not
sufficient in this case to allow the correct classification of
cyclobutane.


In examining the misclassifications presented in both Tables
IV and V, the cyclic compounds are most often misclassified. For
example, the three cyclic compounds, cyclopentene, cyclohexene,
and vinylcyclohexane are misclassified as cyclic alkanes in
almost every trial in Table IV. The cyclic subclass of alkanes
is most difficult to distinguish and is also misclassified most
often, as evidenced in Table V.

In  the  reactions  of  Fe+  and  Y+  with  cyclic  organics, 
often  fewer  products  are  observed  than  in  the  reactions  with 
linear  and  branched  species  which  makes  the  cyclics  very 
difficult  to  distinguish.  Thus,  the  cyclic  compounds  within  an 
organic  class  react  differently  than  the  linear  and  branched 
compounds  in  the  same  class  and  often  appear  more  similar  to  the 
cyclics  of  other  classes. 

Information from Feature Extraction

Along with the examination of misclassifications for the
evaluation of the selectivities and reaction trends within
organic classes, information is also contained in the features
which are chosen to maximize the recognition accuracy of the
data. The feature corresponding to the attachment of H2O to
Fe+ is chosen for the best total recognition of the iron data
set. This feature is unique to the reaction of iron with
alcohols, which is evident from the feature weightings. Its
occurrence in the set of features which best separate alcohols
from the other compounds in the data set indicates a difference
in the reaction of iron with other organics. With this
information a reaction mechanism involving the initial insertion
of Fe+ into the HO-R bond of an alcohol, with subsequent shift
of a β-hydrogen to the metal, and loss of the corresponding
alkene neutral could be postulated. Further study of these
reactions could produce information about the relative bond
strengths of iron to various alkenes versus water. Thus an
important piece of chemical information concerning chemical
reactivities is extracted and highlighted using a purely
empirical pattern recognition approach.

The peak corresponding to YOH+ is chosen for the
recognition of the organics using yttrium. This feature is
distinctive for alcohols. An important difference in the
reactivity of Y+ toward alcohols is indicated by the selection
of this feature, since no FeOH+ is observed for the reactions
of iron with alcohols. The initial insertion of Y+ into the
R-OH bond of the alcohol must be much more exothermic than that
of Fe+ in order to provide enough internal energy to the
reaction complex to fragment to YOH+ and a hydrocarbon radical.
Thus an important reactivity difference between the two metals is
highlighted.

Much of the chemical information extracted through pattern
recognition could be inferred by a detailed manual analysis of
the data. The same analysis could be accomplished virtually
instantaneously with the aid of a super-computer. It is
encouraging that the same important features are extracted by
empirical algorithms as would be selected by the intuition of an
experienced analyst. It is believed, however, that the pattern
recognition approach will uncover information which is more
subtle yet important, and which is difficult to detect without
the aid of this analysis.


CONCLUSION 


Pattern recognition provides an objective means whereby
trends in reactivity of the two metal ions, along with their
selectivities and differences, can be examined. The selectivities of
the metals can be quantitated by examination of the individual
class recognition accuracies as well as the misclassifications.
Trends in reactivity and information regarding reaction
mechanisms can be discovered through analysis of the features
which have been extracted empirically by feature selection.

In this study 72 organics representing six classes have been
recognized with an overall accuracy of 81%, and 24 alkanes,
representing linear, branched and cyclic subclasses, have been
recognized with 96% total accuracy using the combined CI mass
spectral data of yttrium and iron (random guess classification
would produce accuracies of only about 17% and 33%, respectively).
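
The chance levels and the reported percentages can be checked with simple arithmetic. The sketch below assumes equally populated classes, so that random guessing succeeds with probability 1/(number of classes), and converts compound counts stated in the text (10 of 12 alkenes; 23 of 24 alkanes, cyclobutane being the only misclassified one) to rounded percentages. The function names are hypothetical.

```python
def chance_accuracy(n_classes):
    """Percent correct expected from random guessing among equal classes."""
    return 100.0 / n_classes


def percent_correct(n_correct, n_total):
    """Recognition accuracy rounded to the nearest whole percent."""
    return round(100.0 * n_correct / n_total)


six_class_chance = chance_accuracy(6)     # six organic classes
three_class_chance = chance_accuracy(3)   # three alkane subclasses
alkene_accuracy = percent_correct(10, 12)
alkane_subclass_accuracy = percent_correct(23, 24)
print(six_class_chance, three_class_chance,
      alkene_accuracy, alkane_subclass_accuracy)
```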

Thus  metal  ions  can  be  very  useful  for  analytical 
applications  in  unknown  analysis.  The  data  from  several  metal 
ions  can  be  combined  for  general  unknown  identification  or  a 
single  selective  metal  may  be  used  for  identification  of  a 
specific  organic. 


The speed of the analysis and the fact that the unknown does
not have to be present in the data set make pattern recognition
attractive for online applications. Future work will extend the
technique to more complex molecules of interest, such as those
with multiple functionalities, isomeric compounds, and
biologically active samples, and will expand the range of
metal ions. Another long range goal is the
application of artificial intelligence to the optimization of
experimental parameters involved in the metal ion FTMS experiment
such that the maximum information can be obtained.


ACKNOWLEDGMENT 


Acknowledgment  is  made  to  the  Division  of  Chemical  Sciences 
in  the  Office  of  Basic  Energy  Sciences  in  the  United  States 
Department  of  Energy  (DE-AC02-80ER10689)  for  supporting  the  metal 
ion  chemical  ionization  research  and  to  the  National  Science 
Foundation  (CHE-8310039)  for  providing  funds  for  the  advancement 
of  FTMS  methodology.  R.  A.  Forbes  would  like  to  acknowledge  the 
support  of  Lawrence  Livermore  National  Laboratory  and  the  Office 
of  Naval  Research. 

The  authors  would  also  like  to  thank  fellow  research  group 
members  for  their  help  in  collecting  the  metal  ion  data:  Denley 
Jacobson,  William  Stanton,  Leo  Lech,  and  Beth  Stanko. 


LITERATURE  CITED 


(1) Justice, J. B.; Isenhour, T. L. Anal. Chem. 1974, 46, 223.

(2) Abe, H.; Jurs, P. C. Anal. Chem. 1975, 47, 1829.

(3) Domokos, L.; Henneberg, D.; Weimann, B. Anal. Chim. Acta
1983, 150, 37.

(4) Dromey, R. G. Spectra 1984, 10, 3.

(5) Domokos, L.; Henneberg, D.; Weimann, B. Spectra 1984, 10, 11.

(6) Munson, M. S.; Field, F. H. J. Am. Chem. Soc. 1966, 88,
2621.

(7) Munson, B. Anal. Chem. 1977, 49, 772A.

(8) Jennings, K. R. in "Gas Phase Ion Chemistry"; vol. 2,
Bowers, M. T., ed.; Academic Press, New York, 1979, 123.

(9) Burnier, R. C.; Byrd, G. D.; Freiser, B. S. J. Am. Chem.
Soc. 1981, 103, 4360.

(10) Byrd, G. D.; Burnier, R. C.; Freiser, B. S. J. Am. Chem.
Soc. 1982, 104, 3565.

(11) Jacobson, D. B.; Freiser, B. S. J. Am. Chem. Soc. 1983,
105, 5197.

(12) Jacobson, D. B.; Freiser, B. S. J. Am. Chem. Soc. 1983,
105, 7484.

(13) Jacobson, D. B.; Freiser, B. S. J. Am. Chem. Soc. 1983,
105, 7492.

(14) Jurs, P. C.; Isenhour, T. L. "Chemical Applications of
Pattern Recognition"; John Wiley & Sons, New
York/London/Sydney/Toronto, 1975.

(15) Varmuza, K. "Pattern Recognition in Chemistry";
Springer-Verlag, New York, 1975.

(16) Veress, G. E. Trends in Analytical Chemistry 1982, 1, 374.

(17) Habbema, J. D. F. Anal. Chim. Acta 1983, 150, 1.

(18) Hippe, Z. Anal. Chim. Acta 1983, 150, 11.

(19) Leegwater, D. C.; Leegwater, J. A. Trends in Analytical
Chemistry 1984, 3, 66.

(20) Ritter, G. L.; Lowry, S. R.; Wilkins, C. L.; Isenhour, T.
L. Anal. Chem. 1975, 47, 1951.

(21) Kowalski, B. R.; Bender, C. F. J. Am. Chem. Soc. 1972,
94, 5632.

(22) Frankel, D. S. Anal. Chem. 1984, 56, 1011.

(23) Parrish, M. E.; Good, B. W.; Jeltema, H. A.; Hsu, F. S.
Anal. Chim. Acta 1983, 150, 163.

(24) Dunn, W. J., III; Stalling, D. L.; Schwartz, T. R.; Hogan,
J. W.; Petty, J. D.; Johansson, E.; Wold, S. Anal. Chem.
1984, 56, 1308.

(25) Jellum, E.; Bjornson, I.; Nesbakken, R.; Johansson, E.;
Wold, S. Journal of Chromatography 1981, 217, 231.

(26) Kowalski, B. R.; Bender, C. F. Anal. Chem. 1972, 44,
1405.

(27) Bink, J. C. W. G.; Van't Klooster, H. A. Anal. Chim.
Acta 1983, 150, 53.

(28) Thomas, Q. V.; Perone, S. P. Anal. Chem. 1977, 49,
1369.

(29) Bugard, D. R.; Perone, S. P. Anal. Chem. 1978, 50,
1366.

(30) Schachterle, S. D.; Perone, S. P. Anal. Chem. 1981, 53,
1672.

(31) Byers, W. A.; Perone, S. P. Anal. Chem. 1983, 55, 615.

(32) Byers, W. A.; Freiser, B. S.; Perone, S. P. Anal. Chem.
1983, 55, 620.

(33) Chien, M. Anal. Chem. 1985, 57, 348.

(34) Rotter, H.; Varmuza, K. Anal. Chim. Acta 1978, 103, 61.

(35) Brent, D. A.; Roth, B.; Johnson, R. L.; Brunner, T. R.
Biomed. Mass Spectrom. 1981, 8, 440.

(36) Tsao, R.; Voorhees, K. J. Anal. Chem. 1984, 56, 368.

(37) Van Der Greef, J.; Tas, A. C.; Bouwman, J.; Ten Noever de
Brauw, M. C.; Schreurs, W. H. P. Anal. Chim. Acta 1983,
150, 45.

(38) Brauner, A.; Budzikiewicz, H.; Boland, W. Org. Mass
Spectrom. 1982, 17, 161.

(39) Forbes, R. A.; Tews, E. C.; Freiser, B. S. Journal of
Computational Chemistry, submitted, 1985.

(40) Wise, M. B. Ph.D. Thesis, Purdue University, West
Lafayette, 1984.

(41) Wronka, J.; Ridge, D. P. Rev. Sci. Instrum. 1982, 53,
107.

(42) For a review of ICR see:

a) Lehman, T. A.; Bursey, M. M. "Ion Cyclotron
Resonance Spectrometry"; John Wiley & Sons, New
York, 1976.

b) Beauchamp, J. L. Ann. Rev. Phys. Chem. 1971,
22, 527.

For a review of FTMS see:

c) Comisarow, M. B. in Lecture Notes in Chemistry
1982, 31, 484.

d) Wanczek, K. P. Int. J. Mass Spectrom. Ion Phys.
1984, 60, 11.

(43) Kang, H.; Beauchamp, J. L. J. Phys. Chem., in press.

(44) Cover, T. M.; Hart, P. E. IEEE Trans. on Info. Theory
1967, IT-13, 21.

(45) Thomas, Q. V.; DePalma, R. D.; Perone, S. P. Anal. Chem.
1977, 49, 1376.

(46) Fukunaga, K. "Introduction to Statistical Pattern
Recognition"; Academic Press, New York, 1972.


Table I. Compounds used for recognition experiments.


ORGANICS FOR RECOGNITION OF SIX CLASSES

ALKANE                     ALKENE                     KETONE

butane                     1-butene                   butanone
pentane                    1-pentene                  2-pentanone
hexane                     1-hexene                   2-hexanone
heptane                    E-3-hexene                 3-heptanone
2-methylpentane            3-methyl-1-butene          4-heptanone
3-methylpentane            2-methyl-1-pentene         3-methyl-2-butanone
2,3-dimethylbutane         4-methyl-1-pentene         3,3-dimethyl-2-butanone
2,3-dimethylpentane        2,3-dimethyl-1-butene      2,4-dimethyl-3-pentanone
cyclopentane               2,3-dimethyl-2-butene      cyclopentanone
1-methylcyclopentane       cyclopentene               methylcyclopropyl ketone
cyclohexane                cyclohexene                3-methylcyclopentanone
1-methylcyclohexane        vinylcyclohexane           cyclohexanone

ALDEHYDE                   ETHER                      ALCOHOL

propanal                   ethyl ether                ethanol
butanal                    methyl butyl ether         1-propanol
pentanal                   ethyl propyl ether         2-propanol
hexanal                    propyl ether               1-butanol
heptanal                   ethyl butyl ether          2-butanol
octanal                    butyl ether                1-heptanol
2-methylbutanal            isopropyl ether            1-octanol
3-methylbutanal            methyl-t-butyl ether       2-methyl-2-propanol
2,2-dimethylpropanal       sec-butyl ether            2-methyl-2-butanol
benzaldehyde               ethylene oxide             2,2-dimethyl-1-propanol
cyclohexanecarboxaldehyde  propylene oxide            cyclopentanol
cyclooctanecarboxaldehyde  tetrahydrofuran            cyclohexanol


ALKANES FOR RECOGNITION OF THREE SUBCLASSES

LINEAR                     BRANCHED                   CYCLIC

propane                    methylpropane              cyclopropane
butane                     3-methylpentane            cyclobutane
pentane                    2,2-dimethylpropane        cyclopentane
heptane                    2,3-dimethylbutane         methylcyclopentane
octane                     2,3-dimethylpentane        cyclohexane
nonane                     2,4-dimethylpentane        1-methylcyclohexane
decane                     2,2,4-trimethylpentane     ethylcyclopentane
dodecane                   2,2,3,3-tetramethylbutane  1,4-dimethylcyclohexane


Table II. Recognition accuracies (%) for six organic classes with two
feature selection algorithms with three data sets.


                  Fe+               Y+            Fe+ AND Y+
             SSTRA  FANNDE     SSTRA  FANNDE     SSTRA  FANNDE

ALKANE         75     75         83     92         92     92
ALKENE         67     50         50     50         83     75
KETONE         33     25         92     83         67     83
ALDEHYDE       50     50         75     50         67     67
ETHER          75     50         67     50         75     42
ALCOHOL        92     92        100    100        100     92

TOTAL          65     50         78     71         81     75


Table III. Recognition accuracies (%) for three alkane subclasses using
two feature selection algorithms with three data sets.


                  Fe+               Y+            Fe+ AND Y+
             SSTRA  FANNDE     SSTRA  FANNDE     SSTRA  FANNDE

LINEAR        100    100         88     88        100    100
BRANCHED      100    100        100     88        100    100
CYCLIC         63     75         75    100         88     88

TOTAL          88     92         88     92         96     96


Table IV. Nearest neighbors of the misclassifications occurring in the best trials
of the two feature search algorithms with the six organic classes.


ALKANES
  Misclassified compounds: 2,3-dimethylbutane; cyclopentane;
    methylcyclopentane; cyclohexane
  Nearest neighbors, iron: 2-methylbutanal; cyclopentene;
    3,3-dimethyl-2-butanone
  Nearest neighbors, yttrium: cyclopentene; cyclohexene
  Nearest neighbors, iron and yttrium: cyclopentene

ALKENES
  Misclassified compounds: 1-butene; 1-hexene; E-3-hexene;
    3-methyl-1-butene; 2-methyl-1-pentene; 4-methyl-1-pentene;
    cyclopentene; cyclohexene
  Nearest neighbors, iron: methylcyclopropyl ketone; 2,3-dimethylbutane;
    2,3-dimethylbutane; 3-methylcyclopentanone
  Nearest neighbors, yttrium: 3-methylpentane; cyclopentane;
    2-methylpentane; 2,3-dimethylbutane; cyclopentane; cyclohexane
  Nearest neighbors, iron and yttrium: cyclopentane; methylcyclopentane

KETONES
  Misclassified compounds: butanone; 2-pentanone; 2-hexanone;
    3-methyl-2-butanone; 3,3-dimethyl-2-butanone;
    2,4-dimethyl-3-pentanone; cyclopentanone; methylcyclopropyl ketone;
    3-methylcyclopentanone
  Nearest neighbors, iron: butanal; cyclopentanol; 2,3-dimethyl-2-butene;
    2,2-dimethylpropanal; methylcyclopentane; E-3-hexene;
    ethyl butyl ether; 2-methyl-2-butanol
  Nearest neighbors, yttrium: hexanal
  Nearest neighbors, iron and yttrium: pentanal; 1-pentene; propanal;
    pentane

ALDEHYDES
  Misclassified compounds: propanal; butanal; pentanal; hexanal;
    3-methylbutanal; benzaldehyde; cyclooctanecarboxaldehyde
  Nearest neighbors, iron: ethanol; butanone; 1-octanol; 2-hexanone;
    cyclopentene; cyclopentane
  Nearest neighbors, yttrium: butanone; butanone; butanone
  Nearest neighbors, iron and yttrium: butanone; butanone; 2-pentanone;
    3-methylpentane

ETHERS
  Misclassified compounds: ethyl butyl ether; isopropyl ether;
    ethylene oxide; propylene oxide; tetrahydrofuran
  Nearest neighbors, iron: methylcyclopropyl ketone; 2-propanol;
    1-propanol
  Nearest neighbors, yttrium: butanone; butanone; cyclohexanol; butanone
  Nearest neighbors, iron and yttrium: butanone; butanone; 1-pentene

ALCOHOLS
  Misclassified compounds: ethanol
  Nearest neighbors, iron: propylene oxide